Overview

Dataset statistics

Number of variables: 20
Number of observations: 757349
Missing cells: 1416017
Missing cells (%): 9.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 115.6 MiB
Average record size in memory: 160.0 B
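
The derived figures in this table can be cross-checked from the raw counts. The sketch below is a minimal check in plain Python, not part of the report itself; the per-variable missing counts are copied from the variable sections further down, and all variable names are illustrative.

```python
# Cross-check of the dataset statistics using values copied from this report.
n_variables = 20
n_observations = 757349
missing_cells = 1416017

# Missing counts for the four variables this report lists with missing values.
missing_by_variable = {
    "ADR_SCREENED": 79955,
    "ADR_IDS": 757310,
    "DMOC_TYPE": 578741,
    "NEXT_APPOINTMENT": 11,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 115.6 MiB is itself a rounded figure, so the per-record size is approximate.
total_bytes = 115.6 * 1024 ** 2
avg_record_bytes = total_bytes / n_observations

print(sum(missing_by_variable.values()) == missing_cells)
print(f"{missing_pct:.1f}%")
print(f"{avg_record_bytes:.2f} B")
```

Because the total size is rounded to one decimal place, the computed record size agrees with the reported 160.0 B only to within a fraction of a byte.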

Variable types

Categorical: 10
Numeric: 7
Unsupported: 2
Boolean: 1

Warnings

State has constant value "Adamawa" Constant
Regimen has a high cardinality: 109 distinct values High cardinality
PHARMACY_ID is highly correlated with PATIENT_ID and 2 other fields High correlation
PATIENT_ID is highly correlated with PHARMACY_ID and 1 other fields High correlation
FACILITY_ID is highly correlated with PHARMACY_ID and 1 other fields High correlation
ADHERENCE is highly correlated with PHARMACY_ID High correlation
PATIENT_ID is highly correlated with FACILITY_ID High correlation
FACILITY_ID is highly correlated with PATIENT_ID High correlation
ADR_IDS is highly correlated with PHARMACY_ID and 5 other fields High correlation
PHARMACY_ID is highly correlated with ADR_IDS and 6 other fields High correlation
Regimen Line is highly correlated with ADR_IDS High correlation
AFTERNOON is highly correlated with EVENING High correlation
DMOC_TYPE is highly correlated with PHARMACY_ID and 3 other fields High correlation
L.G.A is highly correlated with ADR_IDS and 6 other fields High correlation
ADHERENCE is highly correlated with ADR_IDS and 5 other fields High correlation
PATIENT_ID is highly correlated with ADR_IDS and 5 other fields High correlation
EVENING is highly correlated with AFTERNOON High correlation
Facility Name is highly correlated with ADR_IDS and 6 other fields High correlation
FACILITY_ID is highly correlated with PHARMACY_ID and 5 other fields High correlation
ADR_SCREENED has 79955 (10.6%) missing values Missing
ADR_IDS has 757310 (> 99.9%) missing values Missing
DMOC_TYPE has 578741 (76.4%) missing values Missing
MORNING is highly skewed (γ1 = 160.802169) Skewed
BODY_WEIGHT is highly skewed (γ1 = 28.28863313) Skewed
PHARMACY_ID has unique values Unique
DATE_VISIT is an unsupported type, check if it needs cleaning or further analysis Unsupported
NEXT_APPOINTMENT is an unsupported type, check if it needs cleaning or further analysis Unsupported
MORNING has 301222 (39.8%) zeros Zeros
EVENING has 201465 (26.6%) zeros Zeros
BODY_WEIGHT has 755118 (99.7%) zeros Zeros
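
The percentages in the Missing and Zeros warnings follow directly from the counts and the number of observations. A minimal sketch of that arithmetic, assuming one-decimal rounding (the helper name `share` is illustrative, not part of pandas-profiling):

```python
N_ROWS = 757349  # number of observations reported in the overview


def share(count, total=N_ROWS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts copied from the Zeros and Missing warnings above.
zeros = {"MORNING": 301222, "EVENING": 201465, "BODY_WEIGHT": 755118}
missing = {"ADR_SCREENED": 79955, "ADR_IDS": 757310, "DMOC_TYPE": 578741}

zero_pct = {name: share(count) for name, count in zeros.items()}
missing_pct = {name: share(count) for name, count in missing.items()}

for name, pct in {**zero_pct, **missing_pct}.items():
    print(f"{name}: {pct}%")
```

Plain rounding turns the ADR_IDS share into 100.0, which is why the report prints "> 99.9%" instead: 39 rows still carry a value.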

Reproduction

Analysis started: 2021-06-15 09:14:13.063728
Analysis finished: 2021-06-15 09:15:41.151713
Duration: 1 minute and 28.09 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
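
The reported duration is the difference between the two timestamps. A small sketch of that calculation with the standard library; the format string is an assumption based on how the timestamps are printed here:

```python
from datetime import datetime

# Format inferred from the timestamps shown in this section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-06-15 09:14:13.063728", FMT)
finished = datetime.strptime("2021-06-15 09:15:41.151713", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```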

Variables

State
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
Adamawa: 757349

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 5301443
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adamawa
2nd row: Adamawa
3rd row: Adamawa
4th row: Adamawa
5th row: Adamawa

Common Values

Value    Count   Frequency (%)
Adamawa  757349  100.0%

Length

Histogram of lengths of the category

Pie chart

Value    Count   Frequency (%)
adamawa  757349  100.0%

Most occurring characters

Value  Count    Frequency (%)
a      2272047  42.9%
A      757349   14.3%
d      757349   14.3%
m      757349   14.3%
w      757349   14.3%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  4544094  85.7%
Uppercase Letter  757349   14.3%

Most frequent character per category

Lowercase Letter
Value  Count    Frequency (%)
a      2272047  50.0%
d      757349   16.7%
m      757349   16.7%
w      757349   16.7%
Uppercase Letter
Value  Count   Frequency (%)
A      757349  100.0%

Most occurring scripts

Value  Count    Frequency (%)
Latin  5301443  100.0%

Most frequent character per script

Latin
Value  Count    Frequency (%)
a      2272047  42.9%
A      757349   14.3%
d      757349   14.3%
m      757349   14.3%
w      757349   14.3%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  5301443  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
a      2272047  42.9%
A      757349   14.3%
d      757349   14.3%
m      757349   14.3%
w      757349   14.3%

L.G.A
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
Mubi South: 316231
Song: 120950
Numan: 116869
Michika: 91173
Hong: 57359
Other values (5): 54767

Length

Max length: 10
Median length: 7
Mean length: 7.093749381
Min length: 4

Characters and Unicode

Total characters: 5372444
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Girei
2nd row: Girei
3rd row: Girei
4th row: Girei
5th row: Girei

Common Values

Value       Count   Frequency (%)
Mubi South  316231  41.8%
Song        120950  16.0%
Numan       116869  15.4%
Michika     91173   12.0%
Hong        57359   7.6%
Gayuk       38373   5.1%
Girei       7252    1.0%
Maiha       5737    0.8%
Demsa       3236    0.4%
Madagali    169     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count   Frequency (%)
mubi     316231  29.5%
south    316231  29.5%
song     120950  11.3%
numan    116869  10.9%
michika  91173   8.5%
hong     57359   5.3%
gayuk    38373   3.6%
girei    7252    0.7%
maiha    5737    0.5%
demsa    3236    0.3%

Most occurring characters

ValueCountFrequency (%)
u787704
14.7%
i518987
9.7%
o494540
9.2%
S437181
8.1%
M413310
 
7.7%
h413141
 
7.7%
b316231
 
5.9%
316231
 
5.9%
t316231
 
5.9%
n295178
 
5.5%
Other values (15)1063710
19.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3982633
74.1%
Uppercase Letter1073580
 
20.0%
Space Separator316231
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u787704
19.8%
i518987
13.0%
o494540
12.4%
h413141
10.4%
b316231
7.9%
t316231
7.9%
n295178
 
7.4%
a261632
 
6.6%
g178478
 
4.5%
k129546
 
3.3%
Other values (8)270965
 
6.8%
Uppercase Letter
ValueCountFrequency (%)
S437181
40.7%
M413310
38.5%
N116869
 
10.9%
H57359
 
5.3%
G45625
 
4.2%
D3236
 
0.3%
Space Separator
ValueCountFrequency (%)
316231
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5056213
94.1%
Common316231
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
u787704
15.6%
i518987
10.3%
o494540
9.8%
S437181
8.6%
M413310
8.2%
h413141
8.2%
b316231
 
6.3%
t316231
 
6.3%
n295178
 
5.8%
a261632
 
5.2%
Other values (14)802078
15.9%
Common
ValueCountFrequency (%)
316231
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5372444
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u787704
14.7%
i518987
9.7%
o494540
9.2%
S437181
8.1%
M413310
 
7.7%
h413141
 
7.7%
b316231
 
5.9%
316231
 
5.9%
t316231
 
5.9%
n295178
 
5.5%
Other values (15)1063710
19.8%

Facility Name
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
Mubi General Hospital: 316231
Song Cottage Hospital: 120950
Numan General Hospital: 116869
Michika General Hospital: 91173
Hong Cottage Hospital: 57359
Other values (5): 54767

Length

Max length: 24
Median length: 21
Mean length: 21.51972208
Min length: 14

Characters and Unicode

Total characters: 16297940
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Girei B Clinic
2nd row: Girei B Clinic
3rd row: Girei B Clinic
4th row: Girei B Clinic
5th row: Girei B Clinic

Common Values

Value                     Count   Frequency (%)
Mubi General Hospital     316231  41.8%
Song Cottage Hospital     120950  16.0%
Numan General Hospital    116869  15.4%
Michika General Hospital  91173   12.0%
Hong Cottage Hospital     57359   7.6%
Guyuk General Hospital    38373   5.1%
Girei B Clinic            7252    1.0%
Maiha Cottage Hospital    5737    0.8%
Borrong General Hospital  3236    0.4%
Cottage Hospital Gulak    169     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value             Count   Frequency (%)
hospital          750097  33.0%
general           565882  24.9%
mubi              316231  13.9%
cottage           184215  8.1%
song              120950  5.3%
numan             116869  5.1%
michika           91173   4.0%
hong              57359   2.5%
guyuk             38373   1.7%
b                 7252    0.3%
Other values (5)  23646   1.0%

Most occurring characters

ValueCountFrequency (%)
a1719879
 
10.6%
1514698
 
9.3%
l1323400
 
8.1%
e1323231
 
8.1%
i1283419
 
7.9%
o1119093
 
6.9%
t1118527
 
6.9%
n871548
 
5.3%
H807456
 
5.0%
s750097
 
4.6%
Other values (16)4466592
27.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12511195
76.8%
Uppercase Letter2272047
 
13.9%
Space Separator1514698
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a1719879
13.7%
l1323400
10.6%
e1323231
10.6%
i1283419
10.3%
o1119093
8.9%
t1118527
8.9%
n871548
7.0%
s750097
6.0%
p750097
6.0%
r579606
 
4.6%
Other values (8)1672298
13.4%
Uppercase Letter
ValueCountFrequency (%)
H807456
35.5%
G611676
26.9%
M413141
18.2%
C191467
 
8.4%
S120950
 
5.3%
N116869
 
5.1%
B10488
 
0.5%
Space Separator
ValueCountFrequency (%)
1514698
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14783242
90.7%
Common1514698
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a1719879
11.6%
l1323400
 
9.0%
e1323231
 
9.0%
i1283419
 
8.7%
o1119093
 
7.6%
t1118527
 
7.6%
n871548
 
5.9%
H807456
 
5.5%
s750097
 
5.1%
p750097
 
5.1%
Other values (15)3716495
25.1%
Common
ValueCountFrequency (%)
1514698
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII16297940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a1719879
 
10.6%
1514698
 
9.3%
l1323400
 
8.1%
e1323231
 
8.1%
i1283419
 
7.9%
o1119093
 
6.9%
t1118527
 
6.9%
n871548
 
5.3%
H807456
 
5.0%
s750097
 
4.6%
Other values (16)4466592
27.4%

Regimen Line
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
ART First Line Adult: 616043
Cotrimoxazole (CTX) Prophylaxis: 87330
Isoniazid Preventive Therapy (IPT): 26129
ART First Line Children: 21648
ART Second Line Adult: 5215
Other values (9): 984

Length

Max length: 46
Median length: 20
Mean length: 21.84627167
Min length: 10

Characters and Unicode

Total characters: 16545252
Distinct characters: 40
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ART First Line Adult
2nd row: Isoniazid Preventive Therapy (IPT)
3rd row: ART First Line Adult
4th row: ART First Line Adult
5th row: ART First Line Adult

Common Values

Value                                           Count   Frequency (%)
ART First Line Adult                            616043  81.3%
Cotrimoxazole (CTX) Prophylaxis                 87330   11.5%
Isoniazid Preventive Therapy (IPT)              26129   3.5%
ART First Line Children                         21648   2.9%
ART Second Line Adult                           5215    0.7%
OI Treatment                                    341     < 0.1%
ART Second Line Children                        222     < 0.1%
ARV Prophylaxis for Pregnant Women              117     < 0.1%
Other Medicines                                 114     < 0.1%
Other anti-infectives (including STI Medicine)  91      < 0.1%
Other values (4)                                99      < 0.1%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
line               643130  21.9%
art                643128  21.9%
first              637691  21.7%
adult              621316  21.1%
prophylaxis        87477   3.0%
cotrimoxazole      87330   3.0%
ctx                87330   3.0%
preventive         26129   0.9%
isoniazid          26129   0.9%
therapy            26129   0.9%
Other values (18)  55504   1.9%

Most occurring characters

ValueCountFrequency (%)
2183944
13.2%
i1556761
 
9.4%
t1373816
 
8.3%
A1264591
 
7.6%
r887514
 
5.4%
e864139
 
5.2%
l818093
 
4.9%
T783284
 
4.7%
s751532
 
4.5%
n724092
 
4.4%
Other values (30)5337486
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9679095
58.5%
Uppercase Letter4455022
26.9%
Space Separator2183944
 
13.2%
Open Punctuation113550
 
0.7%
Close Punctuation113550
 
0.7%
Dash Punctuation91
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i1556761
16.1%
t1373816
14.2%
r887514
9.2%
e864139
8.9%
l818093
8.5%
s751532
7.8%
n724092
7.5%
d675059
7.0%
u621407
 
6.4%
o381297
 
3.9%
Other values (11)1025385
10.6%
Uppercase Letter
ValueCountFrequency (%)
A1264591
28.4%
T783284
17.6%
R643275
14.4%
L643130
14.4%
F637691
14.3%
C196539
 
4.4%
P139852
 
3.1%
X87330
 
2.0%
I52720
 
1.2%
S5528
 
0.1%
Other values (5)1082
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2183944
100.0%
Open Punctuation
ValueCountFrequency (%)
(113550
100.0%
Close Punctuation
ValueCountFrequency (%)
)113550
100.0%
Dash Punctuation
ValueCountFrequency (%)
-91
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14134117
85.4%
Common2411135
 
14.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i1556761
 
11.0%
t1373816
 
9.7%
A1264591
 
8.9%
r887514
 
6.3%
e864139
 
6.1%
l818093
 
5.8%
T783284
 
5.5%
s751532
 
5.3%
n724092
 
5.1%
d675059
 
4.8%
Other values (26)4435236
31.4%
Common
ValueCountFrequency (%)
2183944
90.6%
(113550
 
4.7%
)113550
 
4.7%
-91
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII16545252
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2183944
13.2%
i1556761
 
9.4%
t1373816
 
8.3%
A1264591
 
7.6%
r887514
 
5.4%
e864139
 
5.2%
l818093
 
4.9%
T783284
 
4.7%
s751532
 
4.5%
n724092
 
4.4%
Other values (30)5337486
32.3%

Regimen
Categorical

HIGH CARDINALITY

Distinct: 109
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
TDF(300mg)+3TC(300mg)+DTG(50mg): 231946
TDF(300mg)+3TC(300mg)+EFV(600mg): 227793
AZT(300mg)+3TC(150mg)+NVP(200mg): 128173
Cotrimoxazole 960mg: 85378
Isoniazid 300mg: 25465
Other values (104): 58594

Length

Max length: 62
Median length: 32
Mean length: 29.6154547
Min length: 10

Characters and Unicode

Total characters: 22429235
Distinct characters: 57
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: TDF(300mg)+3TC(300mg)+DTG(50mg)
2nd row: Isoniazid 300mg
3rd row: TDF(300mg)+3TC(300mg)+DTG(50mg)
4th row: TDF(300mg)+3TC(300mg)+EFV(600mg)
5th row: TDF(300mg)+3TC(300mg)+EFV(600mg)

Common Values

Value                                   Count   Frequency (%)
TDF(300mg)+3TC(300mg)+DTG(50mg)         231946  30.6%
TDF(300mg)+3TC(300mg)+EFV(600mg)        227793  30.1%
AZT(300mg)+3TC(150mg)+NVP(200mg)        128173  16.9%
Cotrimoxazole 960mg                     85378   11.3%
Isoniazid 300mg                         25465   3.4%
AZT(300mg)+3TC(150mg)+EFV(600mg)        9576    1.3%
AZT(10mg/ml)+3TC(10mg/ml)+NVP(10mg/ml)  7018    0.9%
TDF(300mg)+3TC(300mg)+NVP(200mg)        5002    0.7%
ABC(60mg)+3TC(30mg)+LPV/r(40/10mg)      3699    0.5%
TDF(300mg)+3TC(30mg)+DTG(50mg)          3497    0.5%
Other values (99)                       29802   3.9%
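
Much of the high cardinality in Regimen comes from dose variants of the same drug combination. One way to explore this is to split each string into drug and dose parts. The sketch below assumes components are joined by "+" and that doses sit in parentheses or after a final space, as in the values listed above; `split_regimen` and `drug_key` are hypothetical helper names, and the less frequent values not shown in this report may need extra handling.

```python
import re

# Matches one component such as "TDF(300mg)" or "LPV/r(40/10mg)".
COMPONENT = re.compile(r"^(?P<drug>[^()]+)\((?P<dose>[^()]+)\)$")


def split_regimen(regimen):
    """Split a regimen string into (drug, dose) pairs.

    Components are assumed to be joined by '+'. A component without
    parentheses, e.g. "Cotrimoxazole 960mg", is split on its last space.
    """
    pairs = []
    for part in regimen.split("+"):
        part = part.strip()
        match = COMPONENT.match(part)
        if match:
            pairs.append((match["drug"].strip(), match["dose"].strip()))
        elif " " in part:
            drug, dose = part.rsplit(" ", 1)
            pairs.append((drug, dose))
        else:
            pairs.append((part, None))
    return pairs


def drug_key(regimen):
    """Dose-free key, e.g. 'TDF+3TC+DTG', for grouping dose variants."""
    return "+".join(drug for drug, _ in split_regimen(regimen))


print(split_regimen("ABC(60mg)+3TC(30mg)+LPV/r(40/10mg)"))
print(drug_key("TDF(300mg)+3TC(30mg)+DTG(50mg)"))
```

Grouping on the dose-free key would place the two TDF+3TC+DTG variants listed above in one category while keeping the dose available as a separate field.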

Length

Histogram of lengths of the category

Value                                  Count   Frequency (%)
tdf(300mg)+3tc(300mg)+dtg(50mg         231946  26.6%
tdf(300mg)+3tc(300mg)+efv(600mg        227793  26.1%
azt(300mg)+3tc(150mg)+nvp(200mg        128173  14.7%
cotrimoxazole                          87803   10.1%
960mg                                  85378   9.8%
isoniazid                              26082   3.0%
300mg                                  25466   2.9%
azt(300mg)+3tc(150mg)+efv(600mg        9576    1.1%
azt(10mg/ml)+3tc(10mg/ml)+nvp(10mg/ml  7018    0.8%
tdf(300mg)+3tc(300mg)+nvp(200mg        5002    0.6%
Other values (122)                     37520   4.3%

Most occurring characters

ValueCountFrequency (%)
03565814
15.9%
m2139609
9.5%
g2026240
9.0%
(1910887
 
8.5%
)1910887
 
8.5%
31771226
 
7.9%
T1518806
 
6.8%
+1268000
 
5.7%
C739698
 
3.3%
F725335
 
3.2%
Other values (47)4852733
21.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number6499858
29.0%
Lowercase Letter5384861
24.0%
Uppercase Letter5259123
23.4%
Open Punctuation1910887
 
8.5%
Close Punctuation1910887
 
8.5%
Math Symbol1268000
 
5.7%
Space Separator114408
 
0.5%
Other Punctuation81211
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m2139609
39.7%
g2026240
37.6%
o290407
 
5.4%
i140738
 
2.6%
a114622
 
2.1%
z114352
 
2.1%
l113913
 
2.1%
r97627
 
1.8%
e88490
 
1.6%
t88243
 
1.6%
Other values (12)170620
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
T1518806
28.9%
C739698
14.1%
F725335
13.8%
D714998
13.6%
V403884
 
7.7%
E243162
 
4.6%
G238131
 
4.5%
A166635
 
3.2%
P158743
 
3.0%
Z156095
 
3.0%
Other values (9)193636
 
3.7%
Decimal Number
ValueCountFrequency (%)
03565814
54.9%
31771226
27.3%
5396126
 
6.1%
6336577
 
5.2%
1180403
 
2.8%
2153374
 
2.4%
985378
 
1.3%
48959
 
0.1%
81726
 
< 0.1%
7275
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/81209
> 99.9%
,2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(1910887
100.0%
Close Punctuation
ValueCountFrequency (%)
)1910887
100.0%
Math Symbol
ValueCountFrequency (%)
+1268000
100.0%
Space Separator
ValueCountFrequency (%)
114408
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common11785251
52.5%
Latin10643984
47.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
m2139609
20.1%
g2026240
19.0%
T1518806
14.3%
C739698
 
6.9%
F725335
 
6.8%
D714998
 
6.7%
V403884
 
3.8%
o290407
 
2.7%
E243162
 
2.3%
G238131
 
2.2%
Other values (31)1603714
15.1%
Common
ValueCountFrequency (%)
03565814
30.3%
(1910887
16.2%
)1910887
16.2%
31771226
15.0%
+1268000
 
10.8%
5396126
 
3.4%
6336577
 
2.9%
1180403
 
1.5%
2153374
 
1.3%
114408
 
1.0%
Other values (6)177549
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII22429235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
03565814
15.9%
m2139609
9.5%
g2026240
9.0%
(1910887
 
8.5%
)1910887
 
8.5%
31771226
 
7.9%
T1518806
 
6.8%
+1268000
 
5.7%
C739698
 
3.3%
F725335
 
3.2%
Other values (47)4852733
21.6%

PHARMACY_ID
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 757349
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1910151.303
Minimum: 209355
Maximum: 4080353
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 209355
5-th percentile: 398867.4
Q1: 768582
median: 1750427
Q3: 3166281
95-th percentile: 3804175.6
Maximum: 4080353
Range: 3870998
Interquartile range (IQR): 2397699

Descriptive statistics

Standard deviation: 1236196.862
Coefficient of variation (CV): 0.6471722212
Kurtosis: -1.332274199
Mean: 1910151.303
Median Absolute Deviation (MAD): 1109174
Skewness: 0.4037218083
Sum: 1.446651179 × 10^12
Variance: 1.528182681 × 10^12
Monotonicity: Strictly increasing
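
Several of these statistics are simple functions of others in the same section. The following sketch recomputes them from the reported values as a consistency check; it uses the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count) and agrees with the report only up to the precision of the printed figures.

```python
import math

# Values copied from the PHARMACY_ID section of this report.
n = 757349
minimum, maximum = 209355, 4080353
q1, q3 = 768582, 3166281
mean, std = 1910151.303, 1236196.862

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported Interquartile range (IQR)
cv = std / mean                   # reported Coefficient of variation (CV)
variance = std ** 2               # reported Variance
total = mean * n                  # reported Sum

print(value_range, iqr)
print(math.isclose(cv, 0.6471722212, rel_tol=1e-6))
print(math.isclose(variance, 1.528182681e12, rel_tol=1e-6))
print(math.isclose(total, 1.446651179e12, rel_tol=1e-6))
```

A column that is unique, strictly increasing and free of missing values behaves like a record identifier, so its correlations with other fields may say more about row order than about the underlying data.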
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
3147777                1       < 0.1%
3709322                1       < 0.1%
3224435                1       < 0.1%
1121142                1       < 0.1%
1619509                1       < 0.1%
3712098                1       < 0.1%
3728417                1       < 0.1%
3748115                1       < 0.1%
3098498                1       < 0.1%
1137534                1       < 0.1%
Other values (757339)  757339  > 99.9%

Value   Count  Frequency (%)
209355  1      < 0.1%
209362  1      < 0.1%
209368  1      < 0.1%
209375  1      < 0.1%
209382  1      < 0.1%
209389  1      < 0.1%
209396  1      < 0.1%
209402  1      < 0.1%
209407  1      < 0.1%
209415  1      < 0.1%

Value    Count  Frequency (%)
4080353  1      < 0.1%
4080352  1      < 0.1%
4080351  1      < 0.1%
4080350  1      < 0.1%
4080349  1      < 0.1%
4080347  1      < 0.1%
4080346  1      < 0.1%
4080345  1      < 0.1%
4080344  1      < 0.1%
4080343  1      < 0.1%

PATIENT_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 23093
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69294.81317
Minimum: 37869
Maximum: 160842
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 37869
5-th percentile: 38903.4
Q1: 43760
median: 49270
Q3: 55171
95-th percentile: 153578
Maximum: 160842
Range: 122973
Interquartile range (IQR): 11411

Descriptive statistics

Standard deviation: 42509.01086
Coefficient of variation (CV): 0.6134515545
Kurtosis: -0.1400648017
Mean: 69294.81317
Median Absolute Deviation (MAD): 5699
Skewness: 1.322614282
Sum: 5.248035746 × 10^10
Variance: 1807016004
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
150928                352     < 0.1%
49325                 234     < 0.1%
154039                221     < 0.1%
49434                 211     < 0.1%
150993                198     < 0.1%
153889                196     < 0.1%
151667                190     < 0.1%
44837                 185     < 0.1%
49602                 185     < 0.1%
46359                 184     < 0.1%
Other values (23083)  755193  99.7%

Value  Count  Frequency (%)
37869  29     < 0.1%
37870  62     < 0.1%
37871  21     < 0.1%
37872  22     < 0.1%
37873  17     < 0.1%
37874  8      < 0.1%
37875  13     < 0.1%
37876  14     < 0.1%
37877  15     < 0.1%
37878  13     < 0.1%

Value   Count  Frequency (%)
160842  5      < 0.1%
160841  5      < 0.1%
160840  5      < 0.1%
160738  4      < 0.1%
160737  3      < 0.1%
160732  4      < 0.1%
160731  4      < 0.1%
160728  4      < 0.1%
160671  4      < 0.1%
160670  4      < 0.1%

FACILITY_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 840.2045517
Minimum: 421
Maximum: 2887
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 421
5-th percentile: 425
Q1: 433
median: 434
Q3: 436
95-th percentile: 2881
Maximum: 2887
Range: 2466
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 911.882779
Coefficient of variation (CV): 1.085310449
Kurtosis: 1.209708576
Mean: 840.2045517
Median Absolute Deviation (MAD): 1
Skewness: 1.79153985
Sum: 636328077
Variance: 831530.2026
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value  Count   Frequency (%)
434    316231  41.8%
436    120950  16.0%
2881   116869  15.4%
433    91173   12.0%
426    57359   7.6%
425    38373   5.1%
421    7252    1.0%
2884   5737    0.8%
2887   3236    0.4%
2886   169     < 0.1%

Value  Count   Frequency (%)
421    7252    1.0%
425    38373   5.1%
426    57359   7.6%
433    91173   12.0%
434    316231  41.8%
436    120950  16.0%
2881   116869  15.4%
2884   5737    0.8%
2886   169     < 0.1%
2887   3236    0.4%

Value  Count   Frequency (%)
2887   3236    0.4%
2886   169     < 0.1%
2884   5737    0.8%
2881   116869  15.4%
436    120950  16.0%
434    316231  41.8%
433    91173   12.0%
426    57359   7.6%
425    38373   5.1%
421    7252    1.0%
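
The row counts for FACILITY_ID are identical to those for Facility Name and L.G.A, which suggests (but does not prove) that the three columns encode the same ten facilities. The sketch below pairs the labels by matching row counts; this only works because all ten counts are distinct, and the resulting lookup is an inference that should be confirmed against the raw data. The helper name `label_by_count` is illustrative.

```python
# Frequency tables copied from the L.G.A, Facility Name and FACILITY_ID sections.
lga_counts = {
    "Mubi South": 316231, "Song": 120950, "Numan": 116869, "Michika": 91173,
    "Hong": 57359, "Gayuk": 38373, "Girei": 7252, "Maiha": 5737,
    "Demsa": 3236, "Madagali": 169,
}
facility_name_counts = {
    "Mubi General Hospital": 316231, "Song Cottage Hospital": 120950,
    "Numan General Hospital": 116869, "Michika General Hospital": 91173,
    "Hong Cottage Hospital": 57359, "Guyuk General Hospital": 38373,
    "Girei B Clinic": 7252, "Maiha Cottage Hospital": 5737,
    "Borrong General Hospital": 3236, "Cottage Hospital Gulak": 169,
}
facility_id_counts = {
    434: 316231, 436: 120950, 2881: 116869, 433: 91173, 426: 57359,
    425: 38373, 421: 7252, 2884: 5737, 2887: 3236, 2886: 169,
}


def label_by_count(counts):
    """Invert {label: count} to {count: label}; counts must be unique."""
    inverted = {count: label for label, count in counts.items()}
    if len(inverted) != len(counts):
        raise ValueError("row counts are not unique, cannot match on them")
    return inverted


name_of = label_by_count(facility_name_counts)
lga_of = label_by_count(lga_counts)

# Pair each FACILITY_ID with the name and L.G.A that share its row count.
facility_lookup = {
    fid: (name_of[count], lga_of[count])
    for fid, count in facility_id_counts.items()
}

for fid in sorted(facility_lookup):
    print(fid, *facility_lookup[fid], sep=" | ")
```

If the mapping holds, two of the three columns are redundant for modelling, which would explain the high-correlation warnings among them.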

DATE_VISIT
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
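
An "Unsupported" type on a date column usually means the values were not recognised as dates when the report was built. The report does not show the raw values, so the sketch below is only a starting point: the candidate formats are assumptions, and `parse_visit_date` is a hypothetical helper that returns None for anything it cannot parse so that problem rows can be inspected separately.

```python
from datetime import datetime

# Candidate formats are assumptions; inspect the raw column before relying on them.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")


def parse_visit_date(raw):
    """Return a datetime for raw, or None if no candidate format matches."""
    if raw is None:
        return None
    text = str(raw).strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


# Illustrative inputs, not values taken from the dataset.
for sample in ["2021-06-15", "15/06/2021", "not a date", None]:
    print(repr(sample), "->", parse_visit_date(sample))
```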

DURATION
Real number (ℝ≥0)

Distinct: 157
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.24852083
Minimum: 0
Maximum: 9168
Zeros: 2509
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 24
Q1: 60
median: 60
Q3: 90
95-th percentile: 180
Maximum: 9168
Range: 9168
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 45.36618966
Coefficient of variation (CV): 0.619346154
Kurtosis: 2544.775317
Mean: 73.24852083
Median Absolute Deviation (MAD): 0
Skewness: 15.52463523
Sum: 55474694
Variance: 2058.091164
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
60                  381801  50.4%
90                  115506  15.3%
30                  97128   12.8%
180                 73124   9.7%
120                 27109   3.6%
15                  25635   3.4%
56                  8542    1.1%
14                  6526    0.9%
168                 5065    0.7%
84                  4739    0.6%
Other values (147)  12174   1.6%

Value  Count  Frequency (%)
0      2509   0.3%
1      135    < 0.1%
2      8      < 0.1%
3      14     < 0.1%
5      16     < 0.1%
6      36     < 0.1%
7      474    0.1%
8      48     < 0.1%
9      25     < 0.1%
10     72     < 0.1%

Value  Count  Frequency (%)
9168   1      < 0.1%
6028   1      < 0.1%
2019   1      < 0.1%
1801   1      < 0.1%
1800   2      < 0.1%
990    2      < 0.1%
980    1      < 0.1%
960    13     < 0.1%
909    3      < 0.1%
901    1      < 0.1%
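
The gap between the 95-th percentile (180) and the maximum (9168) points to a handful of extreme entries. One common screening rule, which the report itself does not apply, is Tukey's 1.5 × IQR fence. A minimal sketch using the quartiles reported above:

```python
# Quartiles reported for DURATION.
q1, q3 = 60, 90
iqr = q3 - q1  # 30, matching the reported IQR

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest DURATION values listed in this report.
largest = [9168, 6028, 2019, 1801, 1800, 990, 980, 960, 909, 901]
flagged = [value for value in largest if value > upper_fence]

print(lower_fence, upper_fence)
print(flagged)
```

Note that the fence would also flag the value 180, which accounts for 9.7% of rows, so for this variable a domain-based cut-off may be more appropriate than the generic rule.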

MORNING
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 32
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6644295761
Minimum: 0
Maximum: 960
Zeros: 301222
Zeros (%): 39.8%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 1
Maximum: 960
Range: 960
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.376783526
Coefficient of variation (CV): 5.0822294
Kurtosis: 34502.76255
Mean: 0.6644295761
Median Absolute Deviation (MAD): 0
Skewness: 160.802169
Sum: 503205.075
Variance: 11.40266698
Monotonicity: Not monotonic
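
A skewness this large is typical of a column where almost every value is 0 or 1 and a few entries are orders of magnitude larger. The sketch below illustrates the effect on a toy sample (not the real data) using the population moment coefficient of skewness, m3 / m2^1.5; the estimator used by the report may apply a bias correction, so the numbers are indicative only.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy sample: mostly 0/1 entries, then the same sample plus one extreme value.
typical = [0] * 40 + [1] * 59
with_outlier = typical + [960]

print(round(skewness(typical), 2))
print(round(skewness(with_outlier), 2))
```

A single extreme entry is enough to move the coefficient from slightly negative to strongly positive, so the warning says more about a few outliers than about the bulk of the distribution.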
Histogram with fixed size bins (bins=32)
Value              Count   Frequency (%)
1                  445006  58.8%
0                  301222  39.8%
3                  8378    1.1%
2                  2529    0.3%
90                 96      < 0.1%
180                49      < 0.1%
60                 10      < 0.1%
601                9       < 0.1%
15                 7       < 0.1%
120                5       < 0.1%
Other values (22)  38      < 0.1%

Value  Count   Frequency (%)
0      301222  39.8%
0.09   1       < 0.1%
0.1    1       < 0.1%
0.18   1       < 0.1%
1      445006  58.8%
1.015  1       < 0.1%
1.03   3       < 0.1%
1.05   2       < 0.1%
1.056  1       < 0.1%
1.06   3       < 0.1%

Value  Count  Frequency (%)
960    1      < 0.1%
901    2      < 0.1%
601    9      < 0.1%
301    1      < 0.1%
180    49     < 0.1%
151    3      < 0.1%
120    5      < 0.1%
90     96     < 0.1%
61     1      < 0.1%
60     10     < 0.1%

AFTERNOON
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
0: 757342
1: 6
90: 1

Length

Max length: 2
Median length: 1
Mean length: 1.00000132
Min length: 1

Characters and Unicode

Total characters: 757350
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      757342  > 99.9%
1      6       < 0.1%
90     1       < 0.1%
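
AFTERNOON is profiled as Categorical, whereas MORNING and EVENING are numeric, so its values appear to be handled as text. If it is meant to be a dose count like its siblings, converting it to a number makes the three columns comparable. A minimal sketch working from the value counts above (`to_number` is a hypothetical helper; the string keys reflect how the report treats the column, not a confirmed storage type):

```python
# Value counts for AFTERNOON as reported above, keyed by the text values.
afternoon_counts = {"0": 757342, "1": 6, "90": 1}


def to_number(text):
    """Convert a string such as '90' or '2.5' to float; None if not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


numeric_counts = {to_number(key): count for key, count in afternoon_counts.items()}

n_rows = sum(numeric_counts.values())
nonzero_rows = sum(count for value, count in numeric_counts.items() if value)
mean_dose = sum(value * count for value, count in numeric_counts.items()) / n_rows

print(n_rows, nonzero_rows, mean_dose)
```

Only 7 of the 757349 rows are non-zero, so the column carries very little information either way.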

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0      757342  > 99.9%
1      6       < 0.1%
90     1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
0757343
> 99.9%
16
 
< 0.1%
91
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number757350
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0757343
> 99.9%
16
 
< 0.1%
91
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common757350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0757343
> 99.9%
16
 
< 0.1%
91
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII757350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0757343
> 99.9%
16
 
< 0.1%
91
 
< 0.1%

EVENING
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7577864366
Minimum: 0
Maximum: 90
Zeros: 201465
Zeros (%): 26.6%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 1
Maximum: 90
Range: 90
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5241983383
Coefficient of variation (CV): 0.6917494336
Kurtosis: 1511.570952
Mean: 0.7577864366
Median Absolute Deviation (MAD): 0
Skewness: 10.58142663
Sum: 573908.8
Variance: 0.2747838979
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value             Count   Frequency (%)
1                 546546  72.2%
0                 201465  26.6%
3                 8389    1.1%
2                 932     0.1%
0.5               3       < 0.1%
15                3       < 0.1%
2.5               2       < 0.1%
4                 2       < 0.1%
56                1       < 0.1%
0.3               1       < 0.1%
Other values (5)  5       < 0.1%

Value  Count   Frequency (%)
0      201465  26.6%
0.3    1       < 0.1%
0.5    3       < 0.1%
1      546546  72.2%
2      932     0.1%
2.5    2       < 0.1%
3      8389    1.1%
4      2       < 0.1%
10     1       < 0.1%
15     3       < 0.1%

Value  Count  Frequency (%)
90     1      < 0.1%
60     1      < 0.1%
56     1      < 0.1%
30     1      < 0.1%
26     1      < 0.1%
15     3      < 0.1%
10     1      < 0.1%
4      2      < 0.1%
3      8389   1.1%
2.5    2      < 0.1%

ADR_SCREENED
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 79955
Missing (%): 10.6%
Memory size: 1.4 MiB
False: 675039
True: 2355
(Missing): 79955

Value      Count   Frequency (%)
False      675039  89.1%
True       2355    0.3%
(Missing)  79955   10.6%

ADR_IDS
Categorical

HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): 20.5%
Missing: 757310
Missing (%): > 99.9%
Memory size: 5.8 MiB
"1,1,4# , ,": 8
"4,2# ,1": 7
"6#1": 6
"1#1": 4
"4#4": 4
Other values (3): 10

Length

Max length: 16
Median length: 3
Mean length: 6.076923077
Min length: 2

Characters and Unicode

Total characters: 237
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: "4,2# ,1"
2nd row: "4,2# ,1"
3rd row: "4,2# ,1"
4th row: "4,2# ,1"
5th row: "4,2# ,1"

Common Values

Value               Count   Frequency (%)
"1,1,4# , ,"        8       < 0.1%
"4,2# ,1"           7       < 0.1%
"6#1"               6       < 0.1%
"1#1"               4       < 0.1%
"4#4"               4       < 0.1%
"4,3"               4       < 0.1%
"4,3#6,4#8,4#10,3"  3       < 0.1%
"1#"                3       < 0.1%
(Missing)           757310  > 99.9%
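
The ADR_IDS values look like delimited lists rather than true categories. The report does not document the encoding, so the sketch below rests on two unverified assumptions: "#" separates groups, and "," separates integer codes within a group, with blank entries treated as missing. `parse_adr_ids` is a hypothetical helper for exploring the 39 non-missing rows.

```python
def parse_adr_ids(raw):
    """Split an ADR_IDS string into groups of integer codes.

    Assumes '#' separates groups and ',' separates codes within a group;
    blank entries become None. Both assumptions are unverified.
    """
    groups = []
    for group in raw.split("#"):
        codes = []
        for token in group.split(","):
            token = token.strip()
            codes.append(int(token) if token else None)
        groups.append(codes)
    return groups


# The eight distinct values listed in the table above.
for value in ["1,1,4# , ,", "4,2# ,1", "6#1", "1#1", "4#4", "4,3",
              "4,3#6,4#8,4#10,3", "1#"]:
    print(repr(value), "->", parse_adr_ids(value))
```

With only 39 populated rows out of 757349, any correlation reported for this column rests on very little data.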

Length

Histogram of lengths of the category

Pie chart

Value             Count  Frequency (%)
(blank)           16     25.8%
1                 10     16.1%
1,1,4             8      12.9%
4,2               7      11.3%
6#1               6      9.7%
1#1               4      6.5%
4,3               4      6.5%
4#4               4      6.5%
4,3#6,4#8,4#10,3  3      4.8%

Most occurring characters

ValueCountFrequency (%)
,62
26.2%
143
18.1%
#41
17.3%
436
15.2%
23
 
9.7%
310
 
4.2%
69
 
3.8%
27
 
3.0%
83
 
1.3%
03
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number111
46.8%
Other Punctuation103
43.5%
Space Separator23
 
9.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
143
38.7%
436
32.4%
310
 
9.0%
69
 
8.1%
27
 
6.3%
83
 
2.7%
03
 
2.7%
Other Punctuation
ValueCountFrequency (%)
,62
60.2%
#41
39.8%
Space Separator
ValueCountFrequency (%)
23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common237
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
,62
26.2%
143
18.1%
#41
17.3%
436
15.2%
23
 
9.7%
310
 
4.2%
69
 
3.8%
27
 
3.0%
83
 
1.3%
03
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII237
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
,62
26.2%
143
18.1%
#41
17.3%
436
15.2%
23
 
9.7%
310
 
4.2%
69
 
3.8%
27
 
3.0%
83
 
1.3%
03
 
1.3%

PRESCRIP_ERROR
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB
0: 753437
1: 3912

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 757349
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      753437  99.5%
1      3912    0.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0      753437  99.5%
1      3912    0.5%

Most occurring characters

ValueCountFrequency (%)
0753437
99.5%
13912
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number757349
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0753437
99.5%
13912
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common757349
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0753437
99.5%
13912
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII757349
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0753437
99.5%
13912
 
0.5%

ADHERENCE
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.8 MiB

Value | Count
1 | 388667
0 | 368682

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 757349
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 757349 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 757349 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 757349 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 388667 | 51.3%
0 | 368682 | 48.7%

NEXT_APPOINTMENT
Unsupported

REJECTED
UNSUPPORTED

Missing: 11
Missing (%): < 0.1%
Memory size: 5.8 MiB

DMOC_TYPE
Categorical

HIGH CORRELATION
MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 578741
Missing (%): 76.4%
Memory size: 5.8 MiB

Value | Count
Same Facility Refill | 101317
MMD | 66675
Individual delivery/home-based | 5562
MMS | 3800
Different Facility Refill (Private hospital/clinic) | 1085
Other values (6) | 169

Length

Max length: 51
Median length: 20
Mean length: 13.78559751
Min length: 3

Characters and Unicode

Total characters: 2462218
Distinct characters: 38
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MMS
2nd row: MMS
3rd row: MMS
4th row: MMS
5th row: MMD

Common Values

Value | Count | Frequency (%)
Same Facility Refill | 101317 | 13.4%
MMD | 66675 | 8.8%
Individual delivery/home-based | 5562 | 0.7%
MMS | 3800 | 0.5%
Different Facility Refill (Private hospital/clinic) | 1085 | 0.1%
PMVs/Chemists | 56 | < 0.1%
Other | 36 | < 0.1%
CPARP | 33 | < 0.1%
Fixed or ad hoc pick up points | 24 | < 0.1%
Mobile van/other vehicle | 17 | < 0.1%
(Missing) | 578741 | 76.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
facility | 102402 | 26.2%
refill | 102402 | 26.2%
same | 101317 | 25.9%
mmd | 66675 | 17.0%
individual | 5562 | 1.4%
delivery/home-based | 5562 | 1.4%
mms | 3800 | 1.0%
private | 1085 | 0.3%
different | 1085 | 0.3%
hospital/clinic | 1085 | 0.3%
Other values (15) | 350 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
i | 329482 | 13.4%
l | 320537 | 13.0%
e | 229412 | 9.3%
a | 217054 | 8.8%
(space) | 212717 | 8.6%
M | 141023 | 5.7%
y | 107967 | 4.4%
m | 106935 | 4.3%
t | 105790 | 4.3%
S | 105117 | 4.3%
Other values (28) | 586184 | 23.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1709287 | 69.4%
Uppercase Letter | 525759 | 21.4%
Space Separator | 212717 | 8.6%
Other Punctuation | 6723 | 0.3%
Dash Punctuation | 5562 | 0.2%
Open Punctuation | 1085 | < 0.1%
Close Punctuation | 1085 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 329482 | 19.3%
l | 320537 | 18.8%
e | 229412 | 13.4%
a | 217054 | 12.7%
y | 107967 | 6.3%
m | 106935 | 6.3%
t | 105790 | 6.2%
c | 104637 | 6.1%
f | 104572 | 6.1%
d | 22299 | 1.3%
Other values (11) | 60602 | 3.5%

Uppercase Letter
Value | Count | Frequency (%)
M | 141023 | 26.8%
S | 105117 | 20.0%
R | 102438 | 19.5%
F | 102426 | 19.5%
D | 67760 | 12.9%
I | 5562 | 1.1%
P | 1207 | 0.2%
C | 95 | < 0.1%
V | 56 | < 0.1%
A | 36 | < 0.1%
Other values (2) | 39 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 212717 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1085 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 6723 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1085 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5562 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2235046 | 90.8%
Common | 227172 | 9.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 329482 | 14.7%
l | 320537 | 14.3%
e | 229412 | 10.3%
a | 217054 | 9.7%
M | 141023 | 6.3%
y | 107967 | 4.8%
m | 106935 | 4.8%
t | 105790 | 4.7%
S | 105117 | 4.7%
c | 104637 | 4.7%
Other values (23) | 467092 | 20.9%

Common
Value | Count | Frequency (%)
(space) | 212717 | 93.6%
/ | 6723 | 3.0%
- | 5562 | 2.4%
( | 1085 | 0.5%
) | 1085 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2462218 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 329482 | 13.4%
l | 320537 | 13.0%
e | 229412 | 9.3%
a | 217054 | 8.8%
(space) | 212717 | 8.6%
M | 141023 | 5.7%
y | 107967 | 4.4%
m | 106935 | 4.3%
t | 105790 | 4.3%
S | 105117 | 4.3%
Other values (28) | 586184 | 23.8%

BODY_WEIGHT
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 83
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07154842748
Minimum: 0
Maximum: 85
Zeros: 755118
Zeros (%): 99.7%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 85
Range: 85
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.557751708
Coefficient of variation (CV): 21.77199084
Kurtosis: 956.14684
Mean: 0.07154842748
Median Absolute Deviation (MAD): 0
Skewness: 28.28863313
Sum: 54187.13
Variance: 2.426590382
Monotonicity: Not monotonic
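
The descriptive statistics for BODY_WEIGHT are tied together by simple identities, and a short check makes the effect of the zeros easier to see. The sketch below only re-derives values from the figures printed in this section; the variable names are illustrative, and reading the zeros as "weight not recorded" is an assumption that the report itself does not confirm.

```python
# Figures copied from the BODY_WEIGHT section of the report.
n_rows = 757349
n_zeros = 755118
mean = 0.07154842748
std_dev = 1.557751708
variance = 2.426590382
total = 54187.13

# Coefficient of variation is the standard deviation divided by the mean.
cv = std_dev / mean

# Share of rows whose recorded weight is exactly zero.
zero_share = n_zeros / n_rows

# Mean over the non-zero rows only. This is meaningful only under the
# ASSUMPTION that a zero means "not recorded" rather than a true weight.
n_nonzero = n_rows - n_zeros
mean_nonzero = total / n_nonzero

print(f"CV = {cv:.4f}")
print(f"zero share = {zero_share:.2%}")
print(f"non-zero rows = {n_nonzero}, mean over non-zero rows = {mean_nonzero:.2f}")
```

Under that assumption, the overall mean of about 0.07 says little about patient weights, since it is dominated by the 99.7% of rows that hold a zero.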
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 755118 | 99.7%
30 | 187 | < 0.1%
20 | 125 | < 0.1%
15 | 125 | < 0.1%
10 | 113 | < 0.1%
25 | 105 | < 0.1%
18 | 102 | < 0.1%
14 | 87 | < 0.1%
13 | 77 | < 0.1%
54 | 74 | < 0.1%
Other values (73) | 1236 | 0.2%
ValueCountFrequency (%)
0755118
99.7%
1.510
 
< 0.1%
1.626
 
< 0.1%
32
 
< 0.1%
57
 
< 0.1%
632
 
< 0.1%
6.24
 
< 0.1%
721
 
< 0.1%
843
 
< 0.1%
8.057
 
< 0.1%
Value | Count | Frequency (%)
85 | 4 | < 0.1%
83 | 3 | < 0.1%
80 | 7 | < 0.1%
78 | 3 | < 0.1%
75 | 4 | < 0.1%
74 | 3 | < 0.1%
72 | 7 | < 0.1%
69 | 3 | < 0.1%
68 | 3 | < 0.1%
67 | 13 | < 0.1%

Interactions

[Figures: 49 pairwise interaction plots, images not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
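
The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the code used to produce this report; the function name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (dev_x * dev_y)


# Hypothetical sample data: ys is an exact increasing linear function of xs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(xs, ys))
```

Shifting and rescaling one variable by a positive factor, for example replacing each x with 2x + 3, leaves the result unchanged, which is the invariance mentioned above.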

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
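
As a rough sketch of that recipe, the snippet below ranks both variables and then applies the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their ranks is an assumption, since the text above does not say how ties are handled.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions (assumed tie rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: ys = xs ** 3 is nonlinear but strictly increasing.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this sample the rank-based coefficient reaches its maximum while Pearson's r stays below 1, which illustrates why ρ is better suited to nonlinear monotonic relationships.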

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
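
The pair-counting rule above can be written out directly. The sketch below implements exactly the formula as stated (often called tau-a), which applies no correction for ties; whether the report was generated with this variant or a tie-corrected one is not stated here, so treat the function as an illustration with a hypothetical name.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs, with no tie correction."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in xs or ys; it counts toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The double loop over pairs is quadratic in the number of observations, so for a table of this size a library implementation would be the practical choice.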

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
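
One common way to quantify nullity correlation is to turn each column into a presence indicator (1 if the value is present, 0 if it is missing) and compute Pearson's r between the indicators. The sketch below follows that approach as an assumption about how the heatmap is built; the function names and the toy columns are hypothetical, and `None` stands in for a missing cell.

```python
import math


def presence_indicator(column):
    """Map each cell to 1.0 if a value is present and 0.0 if it is missing (None)."""
    return [0.0 if cell is None else 1.0 for cell in column]


def nullity_correlation(col_a, col_b):
    """Pearson's r between the presence indicators of two columns (assumed definition)."""
    a = presence_indicator(col_a)
    b = presence_indicator(col_b)
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if dev_a == 0 or dev_b == 0:
        raise ValueError("undefined when a column is fully present or fully missing")
    return cov / (dev_a * dev_b)


# Toy columns: `second` is missing exactly where `first` is present, and vice versa.
first = [1, None, 3, None]
second = [None, 2, None, 4]
third = [5, None, 7, None]
print(nullity_correlation(first, second))
print(nullity_correlation(first, third))
```

A value near -1 means one column tends to be filled exactly when the other is empty, and a value near +1 means the two columns tend to be missing together.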
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Row | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
0 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 209355 | 37992 | 421 | 2019-11-15 00:00:00 | 30 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2019-12-10 00:00:00 | NaN | 0.0
1 | Adamawa | Girei | Girei B Clinic | Isoniazid Preventive Therapy (IPT) | Isoniazid 300mg | 209362 | 37987 | 421 | 2020-02-21 00:00:00 | 56 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2020-04-15 00:00:00 | NaN | 0.0
2 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 209368 | 38101 | 421 | 2019-11-18 00:00:00 | 60 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2020-01-13 00:00:00 | NaN | 0.0
3 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 209375 | 37937 | 421 | 2018-10-02 00:00:00 | 60 | 0.0 | 0 | 1.0 | NaN | NaN | 0 | 0 | 2018-11-02 00:00:00 | NaN | 0.0
4 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 209382 | 37870 | 421 | 2019-06-03 00:00:00 | 60 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2019-08-02 00:00:00 | NaN | 0.0
5 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 209389 | 38094 | 421 | 2020-02-27 00:00:00 | 30 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2020-03-27 00:00:00 | NaN | 0.0
6 | Adamawa | Girei | Girei B Clinic | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 209396 | 37976 | 421 | 2019-01-24 00:00:00 | 30 | 1.0 | 0 | 0.0 | No | NaN | 0 | 1 | 2019-02-21 00:00:00 | NaN | 0.0
7 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 209402 | 38024 | 421 | 2020-03-23 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2020-06-19 00:00:00 | NaN | 0.0
8 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 209407 | 37870 | 421 | 2017-06-26 00:00:00 | 60 | 0.0 | 0 | 1.0 | NaN | NaN | 0 | 0 | 2017-08-27 00:00:00 | NaN | 0.0
9 | Adamawa | Girei | Girei B Clinic | ART First Line Adult | TDF(300mg)+3TC(300mg)+EFV(600mg) | 209415 | 38031 | 421 | 2019-02-22 00:00:00 | 30 | 1.0 | 0 | 1.0 | NaN | NaN | 0 | 0 | 2019-03-21 00:00:00 | NaN | 0.0

Last rows

Row | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
757339 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080343 | 51773 | 434 | 2021-05-28 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-10 00:00:00 | Same Facility Refill | 0.0
757340 | Adamawa | Mubi South | Mubi General Hospital | ART Second Line Adult | TDF(300mg)+3TC(150mg)+ATV/r(300/100mg) | 4080344 | 51530 | 434 | 2021-05-28 00:00:00 | 90 | 1.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-08-20 00:00:00 | Same Facility Refill | 0.0
757341 | Adamawa | Mubi South | Mubi General Hospital | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 4080345 | 48506 | 434 | 2021-04-16 00:00:00 | 90 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-10-22 00:00:00 | Same Facility Refill | 0.0
757342 | Adamawa | Mubi South | Mubi General Hospital | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 4080346 | 159222 | 434 | 2021-04-28 00:00:00 | 90 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-07-21 00:00:00 | Same Facility Refill | 0.0
757343 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080347 | 45527 | 434 | 2021-05-27 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-07-21 00:00:00 | Same Facility Refill | 0.0
757344 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080349 | 50854 | 434 | 2021-05-28 00:00:00 | 90 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-08-20 00:00:00 | Same Facility Refill | 0.0
757345 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080350 | 45080 | 434 | 2021-05-28 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-12 00:00:00 | Same Facility Refill | 0.0
757346 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080351 | 47811 | 434 | 2021-05-28 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-12 00:00:00 | Same Facility Refill | 0.0
757347 | Adamawa | Mubi South | Mubi General Hospital | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4080352 | 44969 | 434 | 2021-05-28 00:00:00 | 180 | 0.0 | 0 | 1.0 | No | NaN | 0 | 0 | 2021-11-12 00:00:00 | Same Facility Refill | 0.0
757348 | Adamawa | Mubi South | Mubi General Hospital | Isoniazid Preventive Therapy (IPT) | Isoniazid 300mg | 4080353 | 159222 | 434 | 2021-04-28 00:00:00 | 84 | 1.0 | 0 | 0.0 | No | NaN | 0 | 0 | 2021-07-21 00:00:00 | Same Facility Refill | 0.0